Introduction

Puerto Rico, a region highly susceptible to natural disasters, particularly hurricanes, faces significant challenges due to its tropical climate and island geography. The impacts of these disasters on the local population, infrastructure, and ecosystems are profound and often devastating. Hurricane Maria, for example, caused catastrophic damage and a major humanitarian crisis: most of the island's population suffered flooding and a lack of basic resources. In response, our team is using machine learning models to identify areas at high risk of flooding. This initiative aims to provide early warnings to residents, enabling them to take essential precautions. By accurately predicting flood-prone zones, we can guide community members to safety and strategically mitigate the risk of casualties, thereby enhancing disaster preparedness and response efforts across the island.

Pre- and Post-Event NDVI Analysis

In [6]:
##Pre-modeling libraries

#Suppressing warnings
import warnings

#Ignoring all warnings
warnings.filterwarnings('ignore')

#Importing GIS tools
import numpy                   as np
import xarray                  as xr
import matplotlib.pyplot       as plt
import rasterio.features
import rioxarray               as rio
from matplotlib.cm import RdYlGn, Reds

#Importing Planetary Computer tools
import pystac_client
import planetary_computer as pc
import odc
from odc.stac import stac_load

#Importing datetime functions
from datetime import date # date-related calculations

##Modeling libraries

#GeoTiff images
import rasterio
from   osgeo import gdal


#Importing data visualization libraries
from   matplotlib.pyplot import figure
import matplotlib.image  as img
from   PIL               import Image


#Importing libraries for model buildings
import ultralytics
from   ultralytics import YOLO
import labelme2yolo


#Other libraries
import os
import shutil
import zipfile
In [2]:
## Hurricane Maria - San Juan, Puerto Rico ##

#Defining the bounding box for the entire data region 
min_lon = -66.19385887
min_lat =  18.27306794
max_lon = -66.08007533
max_lat =  18.48024350


#Setting geographic boundary
bounds = (min_lon, min_lat, max_lon, max_lat)


#Setting time window
time_window = "2017-03-31/2018-01-01"
In [3]:
#Connecting to the planetary computer
stac = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")


#Searching for data
search = stac.search(collections = ["sentinel-2-l2a"],
                     bbox        = bounds,
                     datetime    = time_window)


#Instantiating results list
items = list(search.get_all_items())
In [4]:
#Pixel resolution for the final product
resolution = 10  # meters per pixel 


#Scaling to degrees per pixel
scale = resolution / 111320.0 # degrees per pixel for CRS:4326 
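As a quick sanity check, this conversion can be verified with plain arithmetic: one degree of latitude spans roughly 111,320 m, so a 10 m pixel is on the order of 9e-5 degrees. A minimal sketch (variable names mirror the cell above):

```python
# Sanity check for the meters-to-degrees conversion used above
resolution = 10                  # meters per pixel
METERS_PER_DEGREE = 111320.0     # approximate length of one degree of latitude

scale = resolution / METERS_PER_DEGREE
print(f"{scale:.2e} degrees per pixel")  # -> 8.98e-05 degrees per pixel
```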
In [5]:
#Loading the Sentinel-2 data into an xarray Dataset (xx)
xx = stac_load(
    items,
    bands      = ["red", "green", "blue", "nir", "SCL"],
    crs        = "EPSG:4326",                            # latitude-longitude
    resolution = scale,                                  # degrees
    chunks     = {"x": 2048, "y": 2048},
    dtype      = "uint16",
    patch_url  = pc.sign,
    bbox       = bounds
)
In [6]:
#Instantiating a colormap for SCL pixel classifications

scl_colormap = np.array(
    [
        [252,  40, 228, 255],  # 0  - NODATA - MAGENTA
        [255,   0,   4, 255],  # 1  - Saturated or Defective - RED
        [0  ,   0,   0, 255],  # 2  - Dark Areas - BLACK
        [97 ,  97,  97, 255],  # 3  - Cloud Shadow - DARK GREY
        [3  , 139,  80, 255],  # 4  - Vegetation - GREEN
        [192, 132,  12, 255],  # 5  - Bare Ground - BROWN
        [21 , 103, 141, 255],  # 6  - Water - BLUE
        [117,   0,  27, 255],  # 7  - Unclassified - MAROON
        [208, 208, 208, 255],  # 8  - Cloud - LIGHT GREY
        [244, 244, 244, 255],  # 9  - Definitely Cloud - WHITE
        [195, 231, 240, 255],  # 10 - Thin Cloud - LIGHT BLUE
        [222, 157, 204, 255],  # 11 - Snow or Ice - PINK
    ],
    dtype="uint8",
)
In [7]:
#SCL classes to filter out: no-data, saturated, cloud shadow, water, and clouds
filter_values = [0, 1, 3, 6, 8, 9, 10]

#Defining cloud mask for filter values
cloud_mask = ~xx.SCL.isin(filter_values) 
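The masking logic can be illustrated with plain Python (a hedged sketch; the notebook applies the same `isin`-style test to the whole xarray `SCL` band, and the pixel values below are hypothetical):

```python
# Keep a pixel only when its SCL class is NOT in the filter list
filter_values = {0, 1, 3, 6, 8, 9, 10}   # no-data, saturated, shadow, water, clouds
scl_pixels    = [4, 6, 8, 5, 3, 4]       # hypothetical SCL values for six pixels

keep = [value not in filter_values for value in scl_pixels]
print(keep)   # -> [True, False, False, True, False, True]
```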
In [8]:
#Applying cloud mask to filter out water, clouds and cloud shadows

# storing as 16-bit integers
cleaned_data = xx.where(cloud_mask).astype("uint16")
In [9]:
#Preparing pairs of time steps to compare pre- and post-event NDVI outputs
first_time    = 0  #Pre-event scene (pair 1)
second_time   = 38 #Post-event scene (pair 1)
first_time_2  = 11 #Pre-event scene (pair 2)
second_time_2 = 30 #Post-event scene (pair 2)
first_time_3  = 18 #Pre-event scene (pair 3)
second_time_3 = 39 #Post-event scene (pair 3)
In [10]:
#Plots of NDVI at two different time slices (1)

#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))


#First image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time ).plot(ax = ax[0],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")


#Second image data
ndvi_image.isel(time = second_time).plot(ax = ax[1],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")


#Axis labels
ax[0].set_title(label = 'NDVI-Time #1'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #2'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)


#Showing the plot
plt.show()
In [11]:
#Plots of NDVI at two different time slices (2)

#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))

#Third image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time_2 ).plot(ax = ax[0],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")


#Fourth image data
ndvi_image.isel(time = second_time_2).plot(ax = ax[1],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")

#Axis labels
ax[0].set_title(label = 'NDVI-Time #3'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #4'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)


#Showing the plot
plt.show()
In [12]:
#Plots of NDVI at two different time slices (3)

#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))

#Fifth image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time_3 ).plot(ax = ax[0],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")


#Sixth image data
ndvi_image.isel(time = second_time_3).plot(ax = ax[1],
                                         vmin = 0.0,
                                         vmax = 0.8,
                                         cmap = "RdYlGn")

#Axis labels
ax[0].set_title(label = 'NDVI-Time #5'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #6'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)


#Showing the plot
plt.show()
In [13]:
#Function for calculating NDVI anomalies
def NDVI(dataset):
    return (dataset.nir - dataset.red) / (dataset.nir + dataset.red)
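For intuition, the formula can be evaluated on a couple of hypothetical reflectance values: healthy vegetation reflects strongly in the near-infrared, pushing NDVI toward 1, while water tends to sit at or below 0.

```python
# Worked example of the NDVI formula on hypothetical reflectances
def ndvi(nir, red):
    return (nir - red) / (nir + red)

print(round(ndvi(nir=0.45, red=0.05), 2))   # -> 0.8   (dense vegetation)
print(round(ndvi(nir=0.05, red=0.10), 2))   # -> -0.33 (water-like surface)
```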
In [14]:
#Computing NDVI for every time slice
ndvi_clean = NDVI(cleaned_data)
In [15]:
#Calculating difference (1)
ndvi_pre     = ndvi_clean.isel(time = first_time)
ndvi_post    = ndvi_clean.isel(time = second_time)
ndvi_anomaly = ndvi_post - ndvi_pre


#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)


#Reversing the colormap for reds
Reds_reverse = "Reds_r"
In [16]:
##Plotting NDVI anomaly (1)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)


#Titles and labels
plt.title (label  = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")


#Showing the plot
plt.show()
In [17]:
#Calculating difference (2)
ndvi_pre     = ndvi_clean.isel(time = first_time_2)
ndvi_post    = ndvi_clean.isel(time = second_time_2)
ndvi_anomaly = ndvi_post - ndvi_pre


#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)


#Reversing the colormap for reds
Reds_reverse = "Reds_r"
In [18]:
##Plotting NDVI anomaly (2)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)


#Titles and labels
plt.title (label  = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")


#Showing the plot
plt.show()
In [19]:
#Calculating difference (3)
ndvi_pre     = ndvi_clean.isel(time = first_time_3)
ndvi_post    = ndvi_clean.isel(time = second_time_3)
ndvi_anomaly = ndvi_post - ndvi_pre


#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)


#Reversing the colormap for reds
Reds_reverse = "Reds_r"
In [20]:
##Plotting NDVI anomaly (3)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)


#Titles and labels
plt.title (label  = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")


#Showing the plot
plt.show()
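The anomaly maps highlight where NDVI dropped after the hurricane. The same idea in a hedged, list-based sketch (the -0.2 threshold mirrors the `vmin` used in the plots but is illustrative, not a calibrated flood criterion):

```python
# Flag pixels whose NDVI fell sharply between the pre- and post-event scenes
pre_ndvi  = [0.72, 0.65, 0.80, 0.30]   # hypothetical pre-event values
post_ndvi = [0.70, 0.35, 0.78, 0.05]   # hypothetical post-event values

anomaly = [post - pre for pre, post in zip(pre_ndvi, post_ndvi)]
flagged = [delta < -0.2 for delta in anomaly]
print(flagged)   # -> [False, True, False, True]
```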

Actionable Recommendations

The first actionable recommendation is to increase funding for “Asistencia para la Mitigación de Riesgos” (Hazard Mitigation Assistance), a FEMA-backed public program dedicated to mitigating losses during natural disasters. The goal would be to incentivize the allocation of monetary resources to effectively carry out evacuation plans and ensure the availability of motivated personnel. By offering incentives within the program, individuals would be encouraged to prioritize evacuation preparedness. Such a program is intended to foster a culture of readiness and proactive response to potential emergencies or disasters. The objective of the program is to enhance state presence and strengthen community cooperation (Gobierno de Puerto Rico).

The second actionable recommendation, directly tied to the first, would be to establish an evacuation plan and gather emergency supplies. The purpose is to guide residents living near the coastline and in areas at risk of river overflow to relocate to firmer ground that is not directly affected by a sudden rise in sea level. These areas lie along the city outskirts and appear less affected by the hurricane in general, but essential supplies such as drinkable water and food are not readily available there. Determining accessible routes is an important aspect of this plan; in many cases, people reside in areas with immediate exposure to water, which makes this effort a priority. Introducing more preparedness strategies to the community is expected to help reduce mental and physical harm after the event, as shown by Joshipura et al. (2022).

Thirdly, as a complement to the second recommendation, it is crucial to spread the word about hurricane dangers. Whether through radio ads, social media, or community outreach, getting the message out is essential. Many people do not realize just how serious hurricanes can be, so educating them about the risks is key. This way, when evacuation plans spring into action, people will understand why it is so important to follow them. It is all about making sure everyone is on the same page when it comes to staying safe during a storm. Social media has proved to be an invaluable communication tool during and after a natural disaster, as studied by Bacon & Mujkic (2016), helping nonprofit organizations act immediately and provide the necessary assistance.

Object Detection (Image Labeling)

Image labeling was done using LabelMe: 25 pre-event and 25 post-event images were labeled and saved into the "manually labeled" folder. These images were not used to train the model; instead, we used a dataset from Roboflow (the reference for it can be found in the README text files inside the zip file). The images downloaded from Roboflow are most likely the output of another object detection model, given that some images are missing clear labels and others have background regions labeled as buildings. That dataset also used a different labeling approach from ours, drawing rectangles that incorporate some of the background, while we used the manual tool to trace the edges of the buildings without including the surroundings. Although the labeling in these images is not perfect, it significantly improved our mAP score.
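To illustrate the difference between the two labeling styles, a polygon traced in LabelMe can be reduced to the YOLO bounding-box format (class, normalized x-center, y-center, width, height). This is a hedged sketch with a hypothetical polygon and image size, not the internals of labelme2yolo:

```python
# Reduce a traced polygon to a normalized YOLO bounding box
def polygon_to_yolo(points, img_w, img_h, class_id):
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (class_id,
            (x_min + x_max) / 2 / img_w,   # normalized x-center
            (y_min + y_max) / 2 / img_h,   # normalized y-center
            (x_max - x_min) / img_w,       # normalized width
            (y_max - y_min) / img_h)       # normalized height

building = [(100, 200), (180, 210), (170, 300), (95, 290)]  # hypothetical vertices
print(polygon_to_yolo(building, 640, 640, class_id=0))
```

The rectangle spanning the polygon's extremes is exactly where the two styles diverge: the box necessarily includes background pixels that the traced polygon excluded.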

Model Building

In [7]:
#Loading the model
model = YOLO('yolov8n.pt')
In [ ]:
#Hyperparameter Tuning (This step was done using an External GPU so the outputs are not visible in this notebook)

#model.tune(data='/Users/juanmanumango/Desktop/Hult/Business Challenege III/Model/data.yaml', epochs=50, iterations=60, optimizer='SGD', val=False)
In [8]:
# Train the model on the dataset for 50 epochs (With best params from tuning)
results = model.train(data = './data.yaml',
                      seed            = 702,     #Seed to replicate results
                      imgsz           = 640,     #Image size
                      epochs          = 50,      #Number of epochs
                      optimizer       = 'SGD',    
                      lr0             = 0.00922, #Critical 
                      lrf             = 0.00707,
                      momentum        = 0.98,
                      weight_decay    = 0.00052,
                      warmup_epochs   = 3.0281,
                      warmup_momentum = 0.95,
                      box             = 8.78755, #Might improve mAP score
                      cls             = 0.44802, #Same
                      dfl             = 1.64183,
                      hsv_h           = 0.00859, #For different lighting
                      hsv_s           = 0.79619,
                      hsv_v           = 0.38164,
                      degrees         = 0.0,     #For objectives in different angles
                      translate       = 0.13664,
                      scale           = 0.33876, 
                      shear           = 0.0,
                      perspective     = 0.0,
                      flipud          = 0.0,
                      fliplr          = 0.40009,
                      bgr             = 0.0,
                      mosaic          = 0.87205, #Default 1
                      mixup           = 0,       #Thought this would be useful
                      copy_paste      = 0
                      )
New https://pypi.org/project/ultralytics/8.2.4 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.0 🚀 Python-3.11.8 torch-2.2.2 CPU (Apple M2)
engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=./data.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train63, exist_ok=False, pretrained=True, optimizer=SGD, verbose=True, seed=702, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.00922, lrf=0.00707, momentum=0.98, weight_decay=0.00052, warmup_epochs=3.0281, warmup_momentum=0.95, warmup_bias_lr=0.1, box=8.78755, cls=0.44802, dfl=1.64183, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.00859, hsv_s=0.79619, hsv_v=0.38164, degrees=0.0, translate=0.13664, scale=0.33876, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.40009, bgr=0.0, mosaic=0.87205, mixup=0, copy_paste=0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=/Users/juanmanumango/runs/detect/train63
Overriding model.yaml nc=80 with nc=4

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]              
  8                  -1  1    460288  ultralytics.nn.modules.block.C2f             [256, 256, 1, True]           
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]                 
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  1     37248  ultralytics.nn.modules.block.C2f             [192, 64, 1]                  
 16                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]                
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  1    123648  ultralytics.nn.modules.block.C2f             [192, 128, 1]                 
 19                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]                 
 22        [15, 18, 21]  1    752092  ultralytics.nn.modules.head.Detect           [4, [64, 128, 256]]           
Model summary: 225 layers, 3011628 parameters, 3011612 gradients, 8.2 GFLOPs

Transferred 319/355 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
train: Scanning /Users/juanmanumango/Desktop/A1_Team2/labels/train... 200 images
train: New cache created: /Users/juanmanumango/Desktop/A1_Team2/labels/train.cache
val: Scanning /Users/juanmanumango/Desktop/A1_Team2/labels/val... 55 images, 0 b
val: New cache created: /Users/juanmanumango/Desktop/A1_Team2/labels/val.cache
WARNING ⚠️ Box and segment counts should be equal, but got len(segments) = 192, len(boxes) = 914. To resolve this only boxes will be used and all segments will be removed. To avoid this please supply either a detect or segment dataset, not a detect-segment mixed dataset.

Plotting labels to /Users/juanmanumango/runs/detect/train63/labels.jpg... 
optimizer: SGD(lr=0.00922, momentum=0.98) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.00052), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to /Users/juanmanumango/runs/detect/train63
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/50         0G      2.042      3.722      1.912        697        640:  
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 2
      1 # Train the model on the dataset for 50 epochs (With best params from tuning)
----> 2 results = model.train(data = './data.yaml',
      3                       seed            = 702,     #Seed to replicate results
      4                       imgsz           = 640,     #Image size
      5                       epochs          = 50,      #Number of epochs
      6                       optimizer       = 'SGD',    
      7                       lr0             = 0.00922, #Critical 
      8                       lrf             = 0.00707,
      9                       momentum        = 0.98,
     10                       weight_decay    = 0.00052,
     11                       warmup_epochs   = 3.0281,
     12                       warmup_momentum = 0.95,
     13                       box             = 8.78755, #Might improve mAP score
     14                       cls             = 0.44802, #Same
     15                       dfl             = 1.64183,
     16                       hsv_h           = 0.00859, #For different lighting
     17                       hsv_s           = 0.79619,
     18                       hsv_v           = 0.38164,
     19                       degrees         = 0.0,     #For objectives in different angles
     20                       translate       = 0.13664,
     21                       scale           = 0.33876, 
     22                       shear           = 0.0,
     23                       perspective     = 0.0,
     24                       flipud          = 0.0,
     25                       fliplr          = 0.40009,
     26                       bgr             = 0.0,
     27                       mosaic          = 0.87205, #Default 1
     28                       mixup           = 0,       #Thought this would be useful
     29                       copy_paste      = 0
     30                       )

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/engine/model.py:673, in Model.train(self, trainer, **kwargs)
    670             pass
    672 self.trainer.hub_session = self.session  # attach optional HUB session
--> 673 self.trainer.train()
    674 # Update model and cfg after training
    675 if RANK in {-1, 0}:

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/engine/trainer.py:198, in BaseTrainer.train(self)
    195         ddp_cleanup(self, str(file))
    197 else:
--> 198     self._do_train(world_size)

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/engine/trainer.py:370, in BaseTrainer._do_train(self, world_size)
    368 with torch.cuda.amp.autocast(self.amp):
    369     batch = self.preprocess_batch(batch)
--> 370     self.loss, self.loss_items = self.model(batch)
    371     if RANK != -1:
    372         self.loss *= world_size

File ~/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/nn/tasks.py:88, in BaseModel.forward(self, x, *args, **kwargs)
     78 """
     79 Forward pass of the model on a single scale. Wrapper for `_forward_once` method.
     80 
   (...)
     85     (torch.Tensor): The output of the network.
     86 """
     87 if isinstance(x, dict):  # for cases of training and validating while training.
---> 88     return self.loss(x, *args, **kwargs)
     89 return self.predict(x, *args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/nn/tasks.py:267, in BaseModel.loss(self, batch, preds)
    264     self.criterion = self.init_criterion()
    266 preds = self.forward(batch["img"]) if preds is None else preds
--> 267 return self.criterion(preds, batch)

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/utils/loss.py:222, in v8DetectionLoss.__call__(self, preds, batch)
    219 # Pboxes
    220 pred_bboxes = self.bbox_decode(anchor_points, pred_distri)  # xyxy, (b, h*w, 4)
--> 222 _, target_bboxes, target_scores, fg_mask, _ = self.assigner(
    223     pred_scores.detach().sigmoid(),
    224     (pred_bboxes.detach() * stride_tensor).type(gt_bboxes.dtype),
    225     anchor_points * stride_tensor,
    226     gt_labels,
    227     gt_bboxes,
    228     mask_gt,
    229 )
    231 target_scores_sum = max(target_scores.sum(), 1)
    233 # Cls loss
    234 # loss[1] = self.varifocal_loss(pred_scores, target_scores, target_labels) / target_scores_sum  # VFL way

File ~/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~/anaconda3/lib/python3.11/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/utils/tal.py:72, in TaskAlignedAssigner.forward(self, pd_scores, pd_bboxes, anc_points, gt_labels, gt_bboxes, mask_gt)
     63     device = gt_bboxes.device
     64     return (
     65         torch.full_like(pd_scores[..., 0], self.bg_idx).to(device),
     66         torch.zeros_like(pd_bboxes).to(device),
   (...)
     69         torch.zeros_like(pd_scores[..., 0]).to(device),
     70     )
---> 72 mask_pos, align_metric, overlaps = self.get_pos_mask(
     73     pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc_points, mask_gt
     74 )
     76 target_gt_idx, fg_mask, mask_pos = self.select_highest_overlaps(mask_pos, overlaps, self.n_max_boxes)
     78 # Assigned target

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/utils/tal.py:92, in TaskAlignedAssigner.get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc_points, mask_gt)
     90 def get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc_points, mask_gt):
     91     """Get in_gts mask, (b, max_num_obj, h*w)."""
---> 92     mask_in_gts = self.select_candidates_in_gts(anc_points, gt_bboxes)
     93     # Get anchor_align metric, (b, max_num_obj, h*w)
     94     align_metric, overlaps = self.get_box_metrics(pd_scores, pd_bboxes, gt_labels, gt_bboxes, mask_in_gts * mask_gt)

File ~/anaconda3/lib/python3.11/site-packages/ultralytics/utils/tal.py:229, in TaskAlignedAssigner.select_candidates_in_gts(xy_centers, gt_bboxes, eps)
    227 bbox_deltas = torch.cat((xy_centers[None] - lt, rb - xy_centers[None]), dim=2).view(bs, n_boxes, n_anchors, -1)
    228 # return (bbox_deltas.min(3)[0] > eps).to(gt_bboxes.dtype)
--> 229 return bbox_deltas.amin(3).gt_(eps)

KeyboardInterrupt: 

The model was also run using the external GPU; to see the outputs of this model, please refer to the SGD.ipynb file. Nevertheless, the model should still run without any issues on a small sample of the larger dataset used in our actual model.

Model Analysis

In [3]:
#Plotting model results

figure(figsize=(15, 10), dpi=80)
# reading the image 
results = img.imread(fname = './train124/results.png', format = 'png') # change this as needed
# displaying the image 
plt.imshow(results)
#plt.axis('off')  # Turn off axis numbers and ticks
plt.show()

The results for the model were excellent: the best mAP score ended up being 0.789 (this can be seen in the SGD.ipynb file). This was mainly due to the quality of our data. The dataset downloaded from Roboflow provided a large amount of training data that was labeled almost perfectly, which was the biggest factor in improving our score. Hyperparameter tuning ended up being less significant than we expected, improving our model by less than 10%. With more time, we could have executed a more thorough tuning process that might have significantly improved our model.

In [4]:
#Plotting confusion matrix

figure(figsize=(20,15), dpi=80)  
# reading the image 
cf = img.imread('./train124/confusion_matrix.png') # change this as needed
# displaying the image 
plt.imshow(cf)
plt.show()

Looking at the confusion matrix, we can observe the model's areas for improvement. Overall, the predictions were good, but many background regions were labeled as buildings. In addition, there are a few false predictions involving residential buildings. Adjusting the class loss weights to be less sensitive would improve our precision score and therefore our mAP score.
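The precision hit from those background false positives can be made concrete with hypothetical counts (a sketch; the numbers below are illustrative, not read from the matrix):

```python
# Precision and recall for a 'building' class with many background false positives
tp = 180   # buildings detected correctly        (hypothetical)
fp = 60    # background patches called buildings (hypothetical)
fn = 25    # buildings the model missed          (hypothetical)

precision = tp / (tp + fp)
recall    = tp / (tp + fn)
print(round(precision, 3), round(recall, 3))   # -> 0.75 0.878
```

Since mAP averages precision over recall levels, cutting the false-positive count directly lifts the precision term and, with it, the overall score.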

Making Predictions on the Submission Data

In [52]:
#Loading best model
model = YOLO('./train124/weights/best.pt') # change this as needed
In [54]:
# # Decoding according to the .yaml file class names order
# decoding_of_predictions ={0: 'undamagedcommercialbuilding', 1: 'undamagedresidentialbuilding', 2: 'damagedresidentialbuilding', 3: 'damagedcommercialbuilding'}

# directory = './Submission Data (12 images)'
# # Directory to store outputs
# results_directory = 'Validation_Data_Results'

# # Create submission directory if it doesn't exist
# if not os.path.exists(results_directory):
#     os.makedirs(results_directory)

# # Loop through each file in the directory
# for filename in os.listdir(directory):
#     # Check if the current object is a file and ends with .jpeg
#     if os.path.isfile(os.path.join(directory, filename)) and filename.lower().endswith('.jpg'):
#         # Perform operations on the file
#         file_path = os.path.join(directory, filename)
#         print(file_path)
#         print("Making a prediction on ", filename)
#         results = model.predict(file_path, save=True, iou=0.5, save_txt=True, conf=0.25)
        
#         for r in results:
#             conf_list = r.boxes.conf.numpy().tolist()
#             clss_list = r.boxes.cls.numpy().tolist()
#             original_list = clss_list
#             updated_list = []
#             for element in original_list:
#                  updated_list.append(decoding_of_predictions[int(element)])

#         bounding_boxes = r.boxes.xyxy.numpy()
#         confidences = conf_list
#         class_names = updated_list

#         # Check if bounding boxes, confidences and class names match
#         if len(bounding_boxes) != len(confidences) or len(bounding_boxes) != len(class_names):
#             print("Error: Number of bounding boxes, confidences, and class names should be the same.")
#             continue
#         text_file_name = os.path.splitext(filename)[0]
#         # Creating a new .txt file for each image in the submission_directory
#         with open(os.path.join(results_directory, f"{text_file_name}.txt"), "w") as file:
#             for i in range(len(bounding_boxes)):
#                 # Get coordinates of each bounding box
#                 left, top, right, bottom = bounding_boxes[i]
#                 # Write content to file in desired format
#                 file.write(f"{class_names[i]} {confidences[i]} {left} {top} {right} {bottom}\n")
#         print("Output files generated successfully.")
./Submission Data (12 images)/Validation_Post_Event_009.jpg
Making a prediction on  Validation_Post_Event_009.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_009.jpg: 640x640 6 damagedcommercialbuildings, 5 damagedresidentialbuildings, 12 undamagedcommercialbuildings, 9 undamagedresidentialbuildings, 88.9ms
Speed: 4.5ms preprocess, 88.9ms inference, 1.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_008.jpg
Making a prediction on  Validation_Post_Event_008.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_008.jpg: 640x640 2 damagedcommercialbuildings, 7 damagedresidentialbuildings, 2 undamagedcommercialbuildings, 14 undamagedresidentialbuildings, 76.0ms
Speed: 1.6ms preprocess, 76.0ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_011.jpg
Making a prediction on  Validation_Post_Event_011.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_011.jpg: 640x640 3 damagedcommercialbuildings, 6 damagedresidentialbuildings, 7 undamagedcommercialbuildings, 6 undamagedresidentialbuildings, 77.7ms
Speed: 1.7ms preprocess, 77.7ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_005.jpg
Making a prediction on  Validation_Post_Event_005.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_005.jpg: 640x640 10 damagedresidentialbuildings, 2 undamagedcommercialbuildings, 37 undamagedresidentialbuildings, 78.6ms
Speed: 1.7ms preprocess, 78.6ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_004.jpg
Making a prediction on  Validation_Post_Event_004.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_004.jpg: 640x640 5 damagedresidentialbuildings, 12 undamagedresidentialbuildings, 82.3ms
Speed: 1.7ms preprocess, 82.3ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_010.jpg
Making a prediction on  Validation_Post_Event_010.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_010.jpg: 640x640 9 damagedcommercialbuildings, 4 damagedresidentialbuildings, 9 undamagedcommercialbuildings, 5 undamagedresidentialbuildings, 82.7ms
Speed: 1.7ms preprocess, 82.7ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_006.jpg
Making a prediction on  Validation_Post_Event_006.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_006.jpg: 640x640 6 damagedcommercialbuildings, 14 damagedresidentialbuildings, 4 undamagedcommercialbuildings, 32 undamagedresidentialbuildings, 119.2ms
Speed: 1.8ms preprocess, 119.2ms inference, 0.7ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_012.jpg
Making a prediction on  Validation_Post_Event_012.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_012.jpg: 640x640 2 damagedcommercialbuildings, 12 damagedresidentialbuildings, 7 undamagedcommercialbuildings, 22 undamagedresidentialbuildings, 79.2ms
Speed: 2.3ms preprocess, 79.2ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_007.jpg
Making a prediction on  Validation_Post_Event_007.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_007.jpg: 640x640 1 damagedcommercialbuilding, 26 damagedresidentialbuildings, 2 undamagedcommercialbuildings, 28 undamagedresidentialbuildings, 98.2ms
Speed: 1.6ms preprocess, 98.2ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_003.jpg
Making a prediction on  Validation_Post_Event_003.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_003.jpg: 640x640 12 damagedresidentialbuildings, 12 undamagedresidentialbuildings, 102.1ms
Speed: 1.7ms preprocess, 102.1ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_002.jpg
Making a prediction on  Validation_Post_Event_002.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_002.jpg: 640x640 1 damagedcommercialbuilding, 11 damagedresidentialbuildings, 10 undamagedresidentialbuildings, 91.4ms
Speed: 3.1ms preprocess, 91.4ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.
./Submission Data (12 images)/Validation_Post_Event_001.jpg
Making a prediction on  Validation_Post_Event_001.jpg

image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_001.jpg: 640x640 18 damagedresidentialbuildings, 1 undamagedcommercialbuilding, 34 undamagedresidentialbuildings, 88.5ms
Speed: 1.6ms preprocess, 88.5ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 640)
Results saved to /Users/juanmanumango/runs/detect/predict3
12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels
Output files generated successfully.

The submission results can be found in the zip file under the "Best Model Results" folder.
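Since each output line follows the `<class> <conf> <left> <top> <right> <bottom>` format written by the loop above, the per-class detection counts can be tallied with a short sketch (the directory name matches the `results_directory` used earlier):

```python
import os
from collections import Counter

def tally_predictions(results_directory):
    """Count predicted classes across the .txt files written by the
    prediction loop. Each line has the form:
    <class_name> <conf> <left> <top> <right> <bottom>"""
    counts = Counter()
    for fname in os.listdir(results_directory):
        if not fname.endswith('.txt'):
            continue
        with open(os.path.join(results_directory, fname)) as f:
            for line in f:
                parts = line.split()
                if parts:
                    counts[parts[0]] += 1
    return counts

# Example usage against the folder generated above:
# print(tally_predictions('Validation_Data_Results').most_common())
```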

Conclusion

After finishing this project, the most notable insights we encountered were as follows:

  1. According to the National Oceanic and Atmospheric Administration, Hurricane Maria became a Category 5 hurricane within an 18-hour period. This rapid intensification highlights the need for swift response mechanisms in hurricane-prone regions.

  2. Based on the findings in our code, the buildings in San Juan that suffered the most damage were commercial. Of a total of 14,997 residential buildings and 2,411 commercial buildings, 14% of residential buildings and 19% of commercial buildings were damaged. This shows the vulnerability of the infrastructure and the need for robust building codes and construction practices in the area.

  3. As reported by RAND, Puerto Rico’s municipal governments were unprepared for a disaster of this magnitude. This lack of preparedness amplified the effects of the hurricane and hindered effective response and recovery efforts.

Based on these insights, our team proposes several actionable recommendations to enhance disaster preparedness and response in Puerto Rico. These include refining the existing evacuation plans to be more region-specific, which would allow for more targeted and efficient evacuations. Also, we suggest implementing area-specific disaster management plans that address the unique needs and vulnerabilities of different regions and their people. As a final recommendation, an educational program should be established to raise awareness about disaster preparedness and provide training on how to effectively respond to emergencies. Such initiatives are crucial for building resilience and ensuring a more robust response to future natural disasters. By implementing these strategies, we believe Puerto Rico can better safeguard its communities and infrastructure against the inevitable challenges posed by future storms.

Steps we would take to improve the model if we had more time

Enhance data collection:

We would expand our data collection efforts to include not only NDVI but also other relevant indices like soil moisture and land surface temperature. This would provide a more comprehensive dataset for analysis, allowing for better predictions and insights into the effects of hurricanes on different types of terrain and vegetation.
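Moisture-sensitive indices such as NDMI follow the same normalized-difference pattern as the NDVI we already compute. A minimal numpy sketch (band values here are illustrative reflectances, not measurements from our scenes):

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)  # eps guards against divide-by-zero

def ndmi(nir, swir, eps=1e-9):
    """Normalized Difference Moisture Index: (NIR - SWIR) / (NIR + SWIR)."""
    nir, swir = np.asarray(nir, dtype=float), np.asarray(swir, dtype=float)
    return (nir - swir) / (nir + swir + eps)

# Illustrative reflectance values for healthy vegetation:
print(ndvi(0.45, 0.05))   # high NDVI -> dense vegetation
print(ndmi(0.45, 0.20))   # positive NDMI -> moist canopy
```

Both functions broadcast over full raster arrays, so they could be applied directly to the band stacks loaded via `stac_load` earlier in the notebook.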

Refine our predictive model:

The main challenge for us was getting used to the tools and methods necessary for this assignment. Learning how to utilize external GPUs and tools that were completely new to us, such as YOLO, required a lot of time practicing and understanding. With more time, we believe we could have done a much better job tuning our model. Tools such as Ray Tune, which we didn't manage to get working properly, could have substantially improved our model's performance.
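As a rough sketch of what that tuning could look like, the Ultralytics tuner accepts a search space of (min, max) ranges. The ranges and iteration counts below are illustrative guesses, not values we validated:

```python
# Illustrative hyperparameter search space for YOLO tuning; the ranges
# below are guesses for demonstration, not values we validated.
search_space = {
    'lr0': (1e-5, 1e-1),           # initial learning rate
    'lrf': (0.01, 1.0),            # final learning-rate fraction
    'momentum': (0.6, 0.98),       # SGD momentum
    'weight_decay': (0.0, 0.001),  # optimizer weight decay
    'cls': (0.2, 4.0),             # classification loss gain
}

# With the Ultralytics built-in tuner this could be run as, e.g.:
# from ultralytics import YOLO
# model = YOLO('yolov8n.pt')
# model.tune(data='data.yaml', space=search_space, iterations=30, epochs=10)
```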

Adjusting class weights is another improvement we believe would significantly increase our score: making the model less sensitive towards some classes would lead to more accurate predictions. Given more time, we are confident we could have added this feature to our model.

Apart from these more technical improvements, if we had had three months to work on this project, the ideal approach would have been to manually label more images and use those instead of the Roboflow dataset. Although the dataset we found seems good enough, we would have preferred to do this work ourselves to prevent any mistakes or errors the people who developed the dataset might have committed.

Build partnerships:

We would strengthen our ties with local governments, non-governmental organizations, and community leaders. These partnerships would be key in turning our data into actionable plans that can directly benefit the people most at risk.

Bibliography

Bacon Brengarth, L., & Mujkic, E. (2016). WEB 2.0: How social media applications leverage nonprofit responses during a wildfire crisis. Computers in Human Behavior, 54, 589-596. https://doi.org/10.1016/j.chb.2015.07.010.

Joshipura, K. J., Martínez-Lozano, M., Ríos-Jiménez, P. I., Camacho-Monclova, D. M., Noboa-Ramos, C., Alvarado-González, G. A., & Lowe, S. R. (2022). Preparedness, hurricanes Irma and Maria, and impact on health in Puerto Rico. International Journal of Disaster Risk Reduction, 67, 102657.

Gobierno de Puerto Rico. (n.d.). Mitigación de riesgos. Recuperación de Puerto Rico. https://recovery.pr.gov/es/recovery-programs/hazard-mitigation-assistance

RAND Corporation. (2018). Hurricanes Irma and Maria: Impact and Aftermath. https://www.rand.org/hsrd/hsoac/projects/puerto-rico-recovery/hurricanes-irma-and-maria.html

US Department of Commerce, NOAA, National Weather Service. (2017, September 20). Major Hurricane Maria - September 20, 2017. Weather.gov. https://www.weather.gov/sju/maria2017

Feedback for EY

The challenge at hand was a very interesting one that we would have loved to explore further without such a harsh time constraint. As a team, we found it very appealing to work with object detection even though it was completely new to all of us. If we were to mention an improvement, it would be to optimize model training by adjusting epochs and adopting transfer learning, which can significantly reduce runtime. Also, we would like to explore more object detection tools, perhaps focusing on different types of buildings depending on the area, and using classes to decide on the level of damage and the urgency of repair. This would reduce the scope to focus on the structures that are most dangerous and need the most immediate assistance.

Besides this, we greatly enjoyed working with satellite imagery and being completely hands-on with what we were doing. It gave us a feeling of acting with a global purpose for the greater good. In summary, while we faced time constraints and the challenge of adopting new technologies, our team's commitment to innovation and social impact has only grown stronger. We look forward to refining our approach, furthering our technical capabilities, and continuing to contribute to meaningful projects with significant global relevance.